138 research outputs found

    Prediction and evolution of transcription factors and their evolutionary families in prokaryotes

    Presentation abstract. Transcription factors (TFs) play an important role in the genetic regulation of transcription in response to internal and external cellular stimuli, even in a simple bacterium like Escherichia coli [1]. However, little is known about their functional roles, expression dynamics and evolutionary scenarios on a large scale, even in well-studied model organisms. In this short tutorial, I will first talk about the prediction of transcription factors, which form the core of the regulatory repertoires in prokaryotes and are responsible for controlling the expression of genes/transcription units by binding to their cis-regulatory regions. I will present different commonly used sequence-based approaches to predict TFs in prokaryotes and discuss a simple rule of thumb to identify the putative regulatory role played by a TF based on its protein sequence alone [2-6]. I will also discuss some important properties of prokaryotic TFs that distinguish them from the rest of the protein-coding genes. The second part of the talk will concentrate on the evolutionary conservation of TFs and TF families across genomes and the implications of these observations for the phenotypic adaptation of species to different niches [3,7,8]. Finally, I will discuss some future perspectives in this area of research.

    Principles of Interplay Between miRNAs and Transcription Factors in The Cancer Genome

    Digitized for IUPUI ScholarWorks inclusion in 2021. miRNAs are small non-coding RNAs that play a vital role in post-transcriptional gene regulation. They are involved in several important biological processes; hence, their dysregulation has been associated with several diseases. In this study, we propose a novel method to identify dysregulated miRNAs using tumor-matched expression data. Applying the method to expression datasets of nine cancers from TCGA, we identify dysregulated miRNAs in each of these cancers. In six cancers, more than 50% of the dysregulated miRNAs are up-regulated, suggesting a general trend of up-regulation. We then identify transcription factors (TFs) that control the expression of dysregulated miRNAs in cancer by footprinting their upstream regions, in order to build a high-confidence transcriptional regulatory network contributing to the dysregulation of miRNAs. We observe that these TFs are predominantly responsible for up-regulation of miRNAs across cancers. In addition, we find that TFs identified in six or more cancers have different network centralities in the TF-TF regulatory network compared to TFs identified as contributing to dysregulation of miRNAs in a single cancer. Finally, we build cancer-specific dysregulated TF-miRNA networks and identify several novel motifs, including feedback loops involving TFs and miRNAs. These patterns of interactions show how TFs and miRNAs interact in a cancer-specific manner and how dysregulation at one level affects the other.

    Building Integrated Ontological Knowledge Structures with Efficient Approximation Algorithms

    Publisher’s version made available under a Creative Commons license. The integration of ontologies builds knowledge structures that bring new understanding of existing terminologies and their associations. With the steady increase in the number of ontologies, automatic integration of ontologies is preferable to manual solutions in many applications. However, available work on ontology integration is largely heuristic, without guarantees on the quality of the integration results. In this work, we focus on the integration of ontologies with hierarchical structures. We identify optimal structures in this problem and propose optimal and efficient approximation algorithms for integrating a pair of ontologies. Furthermore, we extend the basic problem to address the integration of a large number of ontologies, and correspondingly propose an efficient approximation algorithm for integrating multiple ontologies. The empirical study on both real ontologies and synthetic data demonstrates the effectiveness of our proposed approaches. In addition, the results of integration between the Gene Ontology and the National Drug File Reference Terminology suggest that our method provides a novel way to perform association studies between biomedical terms.

    Benchmarking of de novo assembly algorithms for Nanopore data reveals optimal performance of OLC approaches

    Improved DNA sequencing methods have transformed the field of genomics over the last decade. This has become possible due to the development of inexpensive short-read sequencing technologies, which have now resulted in three generations of sequencing platforms. More recently, a fourth generation of Nanopore-based single-molecule sequencing technology was developed around the MinION® sequencer, which is portable, inexpensive and fast. It is capable of generating reads longer than 100 kb. Although it has many specific advantages, the two major limitations of MinION reads are high error rates and the need to develop downstream pipelines. Algorithms for error correction have already emerged, while the development of pipelines is still at a nascent stage.

    Human protein-RNA interaction network is highly stable across mammals

    Background: RNA-binding proteins (RBPs) are crucial in modulating RNA metabolism in eukaryotes, thereby controlling an extensive network of RBP-RNA interactions. Although previous studies on the conservation of RBP targets have been carried out in lower eukaryotes such as yeast, relatively little is known about the extent of conservation of the binding sites of RBPs across mammalian species.
    Results: In this study, we employ CLIP-seq datasets for 60 human RBPs and demonstrate that most binding sites for a third of these RBPs are conserved in at least 50% of the studied vertebrate species. Across the studied RBPs, binding sites were found to exhibit a median conservation of 58%, ~20% higher than random genomic locations, suggesting a significantly higher preservation of RBP-RNA interaction networks across vertebrates. RBP binding sites were highly conserved across primates, with weak conservation profiles in birds and fishes. We also note that the phylogenetic relationship between members of an RBP family does not explain the extent of conservation of their binding sites across species. Multivariate analysis to uncover features contributing to differences in the extent of conservation of binding sites across RBPs revealed RBP expression level and number of post-transcriptional targets to be the most prominent factors. Examination of the location of binding sites at the gene level confirmed that binding sites occurring in the 3′ region of a gene are highly conserved across species, with 90% of the RBPs exhibiting significantly higher conservation of binding sites in the 3′ region of a gene than in the 5′ region. Gene set enrichment analysis on the extent of conservation of binding sites, performed to identify significantly associated human phenotypes, revealed an enrichment for multiple developmental abnormalities.
    Conclusions: Our results suggest that binding sites of human RBPs are highly conserved across primates, with weak conservation profiles in lower vertebrates, and that the evolutionary relationship between members of an RBP family does not explain the extent of conservation of their binding sites. Expression level and number of targets of an RBP are important factors contributing to the differences in the extent of conservation of binding sites. RBP binding sites on the 3′ ends of a gene are the most conserved across species. Phenotypic analysis on the extent of conservation of binding sites revealed the importance of lineage-specific developmental events in post-transcriptional regulatory network evolution.

    Genomic and mechanistic insights of convergent transcription in bacterial genomes

    Digitized for IUPUI ScholarWorks inclusion in 2021. Convergent gene pairs with overlapping head-to-head configuration are widely spread across both eukaryotic and prokaryotic genomes. They are believed to contribute to the regulation of genes at both transcriptional and post-transcriptional levels, although the factors contributing to their abundance across genomes and the mechanistic basis for their prevalence are poorly understood. In this study, we explore the role of various factors contributing to convergent overlapping transcription in bacterial genomes. Our analysis shows that the proportion of convergent overlapping gene pairs (COGPs) in a genome is affected by endospore formation, bacterial habitat and temperature range. In particular, we show that genomes of bacteria thriving in specialized habitats, such as thermophiles, exhibit a high proportion of COGPs. Our results also show that the density distribution of COGPs across the genomes is high for shorter overlaps, with increased conservation of distances for decreasing overlaps. Our study also reveals that COGPs frequently contain stop codon overlaps, with the middle base exhibiting mismatches between complementary strands. Functional analysis using COG (Clusters of Orthologous Groups) annotations suggested that cell motility, cell metabolism, storage, and cell signaling are enriched among COGPs, suggesting their role in processes beyond regulation. Our analysis provides genomic insights into this underappreciated regulatory phenomenon, allowing a refined understanding of its contribution to bacterial phenotypes.

    Identification and Genomic Analysis of Transcription Factors in Archaeal Genomes Exemplifies Their Functional Architecture and Evolutionary Origin

    Archaea, which represent a large fraction of the phylogenetic diversity of organisms, are prokaryotes with eukaryote-like basal transcriptional machinery. This organization makes the study of their DNA-binding transcription factors (TFs) and their transcriptional regulatory networks particularly interesting. In addition, there are limited experimental data regarding their TFs. In this work, 3,918 TFs were identified and exhaustively analyzed in 52 archaeal genomes. TFs represented less than 5% of the gene products in all the studied species, comparable with the number of TFs identified in parasites or intracellular pathogenic bacteria, suggesting a deficit in this class of proteins. A total of 75 families were identified, of which the HTH_3, AsnC, TrmB, and ArsR families were universally and abundantly identified in all the archaeal genomes. We found that archaeal TFs are significantly smaller than other protein-coding genes in archaea as well as bacterial TFs, suggesting that a large fraction of these small-sized TFs could compensate for the probable deficit of TFs in archaea, possibly by forming different combinations of monomers similar to those observed in eukaryotic transcriptional machinery. Our results show that although the DNA-binding domains of archaeal TFs are similar to those of bacteria, there is an underrepresentation of ligand-binding domains in smaller TFs, which suggests that protein–protein interactions may act as mediators of regulatory feedback, indicating a chimera of bacterial and eukaryotic TF functionality. The analysis presented here contributes to the understanding of the details of the transcriptional apparatus in archaea and provides a framework for the analysis of regulatory networks in these organisms.

    Epitranscriptomic code and its alterations in human disease

    Innovations in epitranscriptomics have resulted in the identification of more than 160 RNA modifications to date. These developments, together with the recent discovery of writers, readers, and erasers of modifications occurring across a wide range of RNAs and tissue types, have led to a surge in integrative approaches for transcriptome-wide mapping of modifications and protein-RNA interaction profiles of epitranscriptome players. RNA modification maps and the crosstalk between them have begun to elucidate the role of modifications as signaling switches, raising the notion of an epitranscriptomic code as a driver of the post-transcriptional fate of RNA. Emerging single-molecule sequencing technologies and the development of antibodies specific to various RNA modifications could enable charting of transcript-specific epitranscriptomic marks across cell types and their alterations in disease.

    Network-based approaches for linking metabolism with environment

    Genome-wide metabolic maps allow the development of network-based computational approaches for linking an organism with its biochemical habitat.

    Database of RNA binding protein expression and disease dynamics (READ DB)

    The RNA Binding Protein (RBP) Expression and Disease Dynamics database (READ DB) is a non-redundant, curated database of human RBPs. RBPs curated from different experimental studies are reported with their annotation, tissue-wide RNA and protein expression levels, evolutionary conservation, disease associations, protein-protein interactions, microRNA predictions, and known RNA recognition sequence motifs, as well as predicted binding targets and associated functional themes, providing a one-stop portal for understanding the expression, evolutionary trajectories and disease dynamics of RBPs in the context of post-transcriptional regulatory networks.